IDENTIFYING LOCALLY OPTIMAL DESIGNS FOR NONLINEAR
MODELS: A SIMPLE EXTENSION WITH PROFOUND 
CONSEQUENCES 

By Min Yang 1 and John Stufken 2 

University of Illinois at Chicago and University of Georgia 

We extend the approach in [Ann. Statist. 38 (2010) 2499-2524] for 
identifying locally optimal designs for nonlinear models. Conceptually 
the extension is relatively simple, but the consequences in terms of 
applications are profound. As we will demonstrate, we can obtain 
results for locally optimal designs under many optimality criteria and 
for a larger class of models than has been done hitherto. In many 
cases the results lead to optimal designs with the minimal number of 
support points. 

1. Introduction. During the last decades nonlinear models have become 
a workhorse for data analysis in many applications. While there is now an 
extensive literature on data analysis for such models, research on design 
selection has not kept pace, even though there has been a spike in activity
in recent years. Identifying optimal designs for nonlinear models is indeed 
much more difficult than the much better studied corresponding problem for 
linear models. For nonlinear models results can typically only be obtained 
on a case-by-case basis, meaning that each combination of model, optimality 
criterion and objective of the experiment requires its own proof. 

Another challenge is that for a nonlinear model an optimal design typ- 
ically depends on the unknown parameters. This leads to the concept of 
locally optimal designs, which are optimal for a priori chosen values of the 
parameters. The designs may be poor if the choice of values is far from 
the true values. Where feasible, a multistage approach could help with this. 
A small initial design is then used to obtain some information about the 
parameters, and this information is used at the next stage to estimate the 
true parameter values and to extend the initial design in a locally optimal
way to a larger design. The design at this second stage could be the final 
design, or there could be additional stages at which more design points are 
selected. The solution presented in this paper is applicable for a one-shot 
approach for finding a locally optimal design as well as for a multistage ap- 
proach. The argument that our method can immediately be applied for the 
multistage approach is exactly as in Yang and Stufken (2009). 

For a broader discussion of the challenges in identifying optimal designs
for generalized linear models, many of which also apply to other nonlinear
models, we refer the reader to Khuri et al. (2006).

The work presented here is an extension of Yang and Stufken (2009), Yang 
(2010) and Dette and Melas (2011). The analytic approach in those papers 
unified and extended many of the results on locally optimal designs that were 
available through the so-called geometric approach. The extension in the 
current paper has major consequences for two reasons. First, it enables the 
application of the basic approach in the three earlier papers to many models 
for which it could until now not be used. As a result, this paper opens the 
door to finding locally optimal designs for models where no feasible approach 
was known so far. Second, for a number of models for which answers could 
be obtained by earlier work, the current extension enables the identification 
of locally optimal designs with a smaller support. This is important because 
it simplifies the search for optimal designs, whether by computational or 
analytical methods. Section 4 will illustrate the impact of our results. 

The basic approach in Yang and Stufken (2009), Yang (2010) and Dette
and Melas (2011), which is also adopted here, is to identify a subclass of
designs with a simple format, so that for any given design $\xi$, there exists
a design $\xi^*$ in that subclass with $I_{\xi^*} \ge I_\xi$ under the Loewner
ordering. We will refer to this subclass as a complete class for this problem.
Here, $I_{\xi^*}$ and $I_\xi$ are information matrices for a parameter vector
$\theta$ under $\xi^*$ and $\xi$, respectively. Others, such as Pukelsheim
(1989), have called such a class essentially complete, which is admittedly
more accurate, but also more cumbersome. When searching for a locally optimal
design, for the common information-based optimality criteria, including A-,
D-, E- and $\Phi_p$-criteria, one can thus restrict consideration to this
complete class, both for a one-shot and for a multistage approach. Also, as
shown in Yang and Stufken (2009), this conclusion holds for arbitrary
functions of the parameters. Ideally, the same complete class results would
apply for all a priori values of the parameter vector $\theta$. However, it
turns out, as we will see in Section 4, that there are instances where
complete class results hold only for certain a priori values of $\theta$.

Yang and Stufken (2009), Yang (2010) and Dette and Melas (2011) identify
small complete classes for certain models. They do so by showing that
for any design $\xi$ that is not in their complete class, there is a design
$\xi^*$ that is in the complete class such that all elements of $I_{\xi^*}$
are the same as the corresponding elements in $I_\xi$, except that one
diagonal element in $I_{\xi^*}$ is at least as large as that in $I_\xi$. This
guarantees of course that $I_{\xi^*} \ge I_\xi$. The contribution of this
paper is that we focus on increasing a principal submatrix rather than just
a single diagonal element. This allows us to obtain results for more models
than could be addressed by Yang and Stufken (2009), Yang (2010) and Dette
and Melas (2011), and also facilitates the identification of smaller complete
classes for some models considered in these earlier papers.

In Section 2 we will present the necessary background, while the main 
results are featured in Section 3. The power of the proposed extension is 
seen through applications in Section 4. We conclude with a short discussion 
in Section 5. 

2. Information matrix and approximate designs. Consider a nonlinear
regression model for which a response variable $y$ depends on a single
regression variable $x$. We assume that the $y$'s are independent and follow
some exponential distribution $G$ with mean $\eta(x,\theta)$, where $\theta$
is the $p \times 1$ parameter vector, and the values of $x$ can be chosen by
the experimenter. Typically approximate designs are used to study optimality
in this context. An approximate design $\xi$ can be written as
$\xi = \{(x_i,\omega_i), i = 1,\ldots,N\}$, where $\omega_i > 0$ is the weight
for design point $x_i$ and $\sum_{i=1}^N \omega_i = 1$. It is often more
convenient to present $\xi$ as $\xi = \{(c_i,\omega_i), i = 1,\ldots,N\}$,
$c_i \in [A,B]$, with the $c_i$'s obtained from the $x_i$'s through a
bijection that may depend on $\theta$. Typically, the information matrix for
$\theta$ under design $\xi$ can be written as

(2.1)  $I_\xi(\theta) = P(\theta)\Bigl(\sum_{i=1}^N \omega_i C(\theta,c_i)\Bigr)(P(\theta))^T,$

where

(2.2)
$$C(\theta,c) = \begin{pmatrix}
\Psi_{11}(c) & & & \\
\Psi_{21}(c) & \Psi_{22}(c) & & \\
\vdots & \vdots & \ddots & \\
\Psi_{p1}(c) & \Psi_{p2}(c) & \cdots & \Psi_{pp}(c)
\end{pmatrix}.$$

The functions $\Psi$ are allowed to depend on $\theta$ not just through $c$,
but in an attempt to simplify notation we write, for example, $\Psi_{11}(c)$
rather than $\Psi_{11}(\theta,c)$. In (2.2), $C(\theta,c)$ is a symmetric
matrix, and $P(\theta)$ is a $p \times p$ nonsingular matrix that depends only
on $\theta$. Some examples of (2.1) and (2.2) will be seen in Section 4.
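To make the structure of (2.1) concrete, the following minimal sketch assembles the weighted sum of the matrices $C(\theta,c_i)$ and the resulting information matrix for a small design. It assumes, purely for illustration, that $C(\theta,c) = h(c)h(c)^T$ for a hypothetical vector $h(c) = (1, c)^T$ and a hypothetical diagonal $P(\theta)$; the helper names are not taken from the paper.

```python
def outer(h):
    """Rank-one matrix h h^T as a list of rows."""
    return [[a * b for b in h] for a in h]

def matmul(x, y):
    """Plain matrix product for lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]

def transpose(x):
    return [list(col) for col in zip(*x)]

def weighted_core(design, h):
    """Sum_i w_i C(theta, c_i), assuming C(theta, c) = h(c) h(c)^T."""
    p = len(h(design[0][0]))
    total = [[0.0] * p for _ in range(p)]
    for c, w in design:
        m = outer(h(c))
        for i in range(p):
            for j in range(p):
                total[i][j] += w * m[i][j]
    return total

def information_matrix(design, h, P):
    """I_xi(theta) = P (sum_i w_i C(theta, c_i)) P^T, as in (2.1)."""
    return matmul(matmul(P, weighted_core(design, h)), transpose(P))

# Hypothetical example: h(c) = (1, c)^T and P = diag(1, 2).
h = lambda c: [1.0, c]
P = [[1.0, 0.0], [0.0, 2.0]]
design = [(0.0, 0.5), (1.0, 0.5)]   # pairs (c_i, w_i); weights sum to 1
info = information_matrix(design, h, P)
print(info)
```

Because $P(\theta)$ does not depend on the design, comparing two designs in the Loewner ordering only requires comparing the inner weighted sums, which is why the remainder of the section works with $C(\theta,c)$ alone.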

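For some $p_1$, $1 \le p_1 < p$, we partition $C(\theta,c)$ as

(2.3)
$$C(\theta,c) = \begin{pmatrix} C_{11}(c) & C_{12}(c) \\ C_{21}(c) & C_{22}(c) \end{pmatrix}.$$

Here, $C_{22}(c)$ is the lower $p_1 \times p_1$ principal submatrix of
$C(\theta,c)$, that is,

(2.4)
$$C_{22}(c) = \begin{pmatrix}
\Psi_{p-p_1+1,p-p_1+1}(c) & \cdots & \Psi_{p-p_1+1,p}(c) \\
\vdots & \ddots & \vdots \\
\Psi_{p,p-p_1+1}(c) & \cdots & \Psi_{pp}(c)
\end{pmatrix}.$$

In the context of local optimality, if designs
$\xi = \{(c_i,\omega_i), i = 1,\ldots,N\}$ and
$\tilde\xi = \{(\tilde c_j,\tilde\omega_j), j = 1,\ldots,\tilde N\}$ satisfy
$\sum_{i=1}^N \omega_i C(\theta,c_i) \le \sum_{j=1}^{\tilde N} \tilde\omega_j C(\theta,\tilde c_j)$,
then it follows from (2.1) that $I_\xi(\theta) \le I_{\tilde\xi}(\theta)$.
Hence, $I_\xi(\theta) \le I_{\tilde\xi}(\theta)$ follows if it holds that

(2.5)
$$\sum_{i=1}^N \omega_i C_{11}(c_i) = \sum_{i=1}^{\tilde N} \tilde\omega_i C_{11}(\tilde c_i), \qquad
\sum_{i=1}^N \omega_i C_{12}(c_i) = \sum_{i=1}^{\tilde N} \tilde\omega_i C_{12}(\tilde c_i) \quad\text{and}$$
$$\sum_{i=1}^N \omega_i C_{22}(c_i) \le \sum_{i=1}^{\tilde N} \tilde\omega_i C_{22}(\tilde c_i).$$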

This is what we explore in this paper. Note that this is more general than
Yang and Stufken (2009), Yang (2010) and Dette and Melas (2011), where
$p_1 = 1$. We develop a theoretical framework for general values of $p_1$.

3. Main results. Following Karlin and Studden (1966) and Dette and
Melas (2011), a set of $k+1$ real-valued continuous functions
$u_0,\ldots,u_k$ defined on an interval $[A,B]$ is called a Chebyshev system
on $[A,B]$ if

(3.1)
$$\begin{vmatrix}
u_0(z_0) & u_0(z_1) & \cdots & u_0(z_k) \\
u_1(z_0) & u_1(z_1) & \cdots & u_1(z_k) \\
\vdots & \vdots & \ddots & \vdots \\
u_k(z_0) & u_k(z_1) & \cdots & u_k(z_k)
\end{vmatrix}$$

is strictly positive whenever $A \le z_0 < z_1 < \cdots < z_k \le B$.
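The determinant condition (3.1) can be probed numerically. The sketch below evaluates the determinant on every increasing tuple drawn from a finite grid; a positive outcome is only evidence for the Chebyshev property, not a proof, since the definition requires positivity for all ordered points in $[A,B]$. The function names and the example systems are illustrative assumptions.

```python
from itertools import combinations

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for j in range(n):
        p = max(range(j, n), key=lambda r: abs(a[r][j]))
        if a[p][j] == 0.0:
            return 0.0
        if p != j:
            a[j], a[p] = a[p], a[j]
            d = -d
        d *= a[j][j]
        for r in range(j + 1, n):
            f = a[r][j] / a[j][j]
            for s in range(j, n):
                a[r][s] -= f * a[j][s]
    return d

def looks_chebyshev(funcs, grid):
    """True if the determinant in (3.1) is positive for every increasing
    tuple of len(funcs) points taken from the sorted grid."""
    for zs in combinations(sorted(grid), len(funcs)):
        m = [[u(z) for z in zs] for u in funcs]   # rows: functions, columns: points
        if det(m) <= 0.0:
            return False
    return True

grid = [0.1 * i for i in range(1, 9)]
monomials = [lambda z: 1.0, lambda z: z, lambda z: z * z]
flipped = [lambda z: 1.0, lambda z: -z]
print(looks_chebyshev(monomials, grid))
print(looks_chebyshev(flipped, grid))
```

For the monomials $\{1, z, z^2\}$ the determinant is a Vandermonde determinant, which is positive for increasing points, whereas replacing $z$ by $-z$ reverses its sign. This is the reason the sign of individual functions matters in the results that follow.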

Along the lines of Yang (2010), we select a maximal set of linearly
independent nonconstant functions from the $\Psi$ functions that appear in the
first $p - p_1$ columns of the matrix $C(\theta,c)$ defined in (2.2), and
rename the selected functions as $\Psi_1,\ldots,\Psi_{k-1}$. For a given
nonzero $p_1 \times 1$ vector $Q$, let

(3.2)  $\Psi_k^Q(c) = Q^T C_{22}(c) Q,$

where $C_{22}(c)$ is as defined in (2.4).

For $\Psi_0 = 1, \Psi_1,\ldots,\Psi_{k-1}$ and $C_{22}(c)$, we will say that
a set of $n_1$ pairs $(c_i,\omega_i)$ is dominated by a set of $n_2$ pairs
$(\tilde c_i,\tilde\omega_i)$ if

(3.3)  $\sum_i \omega_i \Psi_l(c_i) = \sum_i \tilde\omega_i \Psi_l(\tilde c_i), \quad l = 0,1,\ldots,k-1;$

(3.4)  $\sum_i \omega_i \Psi_k^Q(c_i) < \sum_i \tilde\omega_i \Psi_k^Q(\tilde c_i)$  for every nonzero vector $Q$,

where the summations on the left-hand sides are over the $n_1$ subscripts for
the pairs $(c_i,\omega_i)$ and those on the right-hand sides over the $n_2$
subscripts for the pairs $(\tilde c_i,\tilde\omega_i)$.

The following two lemmas provide the basic tools for the main results.
We point out that the pairs $(c_i,\omega_i)$ in these lemmas need not form
a design; in particular, the $\omega_i$'s need not add to 1.

Lemma 1. For the functions $\Psi_0 = 1, \Psi_1,\ldots,\Psi_{k-1}, \Psi_k^Q$
defined on an interval $[A,B]$, suppose that either

(3.5)  $\{\Psi_0,\Psi_1,\ldots,\Psi_{k-1}\}$ and $\{\Psi_0,\Psi_1,\ldots,\Psi_{k-1},\Psi_k^Q\}$
       form Chebyshev systems for every nonzero vector $Q$

or

(3.6)  $\{\Psi_0,\Psi_1,\ldots,\Psi_{k-1}\}$ and $\{\Psi_0,\Psi_1,\ldots,\Psi_{k-1},-\Psi_k^Q\}$
       form Chebyshev systems for every nonzero vector $Q$.

Then the following conclusions hold:

(a) For $k = 2n-1$, if (3.5) holds, then for any set
$S_1 = \{(c_i,\omega_i) : \omega_i > 0, i = 1,\ldots,n\}$ with
$A \le c_1 < \cdots < c_n < B$, there exists a set
$S_2 = \{(\tilde c_i,\tilde\omega_i) : \tilde\omega_i > 0, i = 1,\ldots,n\}$ with
$c_1 < \tilde c_1 < c_2 < \cdots < \tilde c_{n-1} < c_n < \tilde c_n = B$,
such that $S_1$ is dominated by $S_2$.

(b) For $k = 2n-1$, if (3.6) holds, then for any set
$S_1 = \{(c_i,\omega_i) : \omega_i > 0, i = 1,\ldots,n\}$ with
$A < c_1 < \cdots < c_n \le B$, there exists a set
$S_2 = \{(\tilde c_i,\tilde\omega_i) : \tilde\omega_i > 0, i = 0,\ldots,n-1\}$ with
$A = \tilde c_0 < c_1 < \tilde c_1 < c_2 < \cdots < \tilde c_{n-1} < c_n$,
such that $S_1$ is dominated by $S_2$.

(c) For $k = 2n$, if (3.5) holds, then for any set
$S_1 = \{(c_i,\omega_i) : \omega_i > 0, i = 1,\ldots,n\}$ with
$A < c_1 < \cdots < c_n < B$, there exists a set
$S_2 = \{(\tilde c_i,\tilde\omega_i) : \tilde\omega_i > 0, i = 0,\ldots,n\}$ with
$A = \tilde c_0 < c_1 < \tilde c_1 < \cdots < c_n < \tilde c_n = B$,
such that $S_1$ is dominated by $S_2$.

(d) For $k = 2n$, if (3.6) holds, then for any set
$S_1 = \{(c_i,\omega_i) : \omega_i > 0, i = 1,\ldots,n+1\}$ with
$A \le c_1 < \cdots < c_{n+1} \le B$, there exists a set
$S_2 = \{(\tilde c_i,\tilde\omega_i) : \tilde\omega_i > 0, i = 1,\ldots,n\}$ with
$c_1 < \tilde c_1 < \cdots < c_n < \tilde c_n < c_{n+1}$,
such that $S_1$ is dominated by $S_2$.

Proof. Since the proof is similar for all parts, we only provide a proof
for part (a).

Let $S_1$ be as in part (a). First consider the special case that
$Q = (1,0,\ldots,0)^T$. By (1a) of Theorem 3.1 in Dette and Melas (2011),
there exists a set of at most $n$ pairs $(\tilde c_i,\tilde\omega_i)$ with one
of the points equal to $B$ so that (3.3) and (3.4) hold for this $Q$. By
part (a) of Proposition 1 in the Appendix, the number of distinct points with
$\tilde\omega_i > 0$ must then be exactly $n$. Thus we have
$\tilde c_1 < \cdots < \tilde c_n = B$, and the $c_i$'s and $\tilde c_i$'s must
alternate by part (b) of Proposition 1. The result follows now for an
arbitrary nonzero $Q$ by applying Proposition 2 in the Appendix and using
(3.5) and (3.4). □

Lemma 2 partially extends Lemma 1 by observing that larger sets $S_1$ than
in Lemma 1 are also dominated by sets $S_2$ as in that lemma.

Lemma 2. With the same notation and assumptions as in Lemma 1, let
$S_1 = \{(c_i,\omega_i) : \omega_i > 0, A \le c_i \le B, i = 1,\ldots,N\}$,
where $N \ge n$ for cases (a), (b) and (c) of Lemma 1, and $N \ge n+1$ for
case (d). Then the following conclusions hold:

(a) For $k = 2n-1$, if (3.5) holds, then $S_1$ is dominated by a set $S_2$ of
size $n$ that includes $B$ as one of the points.

(b) For $k = 2n-1$, if (3.6) holds, then $S_1$ is dominated by a set $S_2$ of
size $n$ that includes $A$ as one of the points.

(c) For $k = 2n$, if (3.5) holds, then $S_1$ is dominated by a set $S_2$ of
size $n+1$ that includes both $A$ and $B$ as points.

(d) For $k = 2n$, if (3.6) holds, then $S_1$ is dominated by a set $S_2$ of
size $n$.

Proof. The results follow by application of Lemma 1. For example, for
case (a), if $N = n$, the result follows directly from Lemma 1. If $N > n$, we
start with the points $c_1 < c_2 < \cdots < c_N$ in $S_1$. Using Lemma 1 on
the $n$ largest points, we obtain points
$c_1,\ldots,c_{N-n},\tilde c_{N-n+1},\ldots,\tilde c_N = B$ in a set
$\tilde S_1$ that dominates $S_1$. Using Lemma 1 again on the $n$ largest
points other than $\tilde c_N$ in $\tilde S_1$, we move one more point to $B$,
obtaining a new set with $N-1$ points that dominates $S_1$. Continue until the
size of the set is reduced to $n$; this is the desired set $S_2$. □

The first main result is an immediate consequence of Lemma 2. 

Theorem 1. For a regression model with a single regression variable $x$,
suppose that the information matrix $C(\theta,c)$ can be written as in (2.1)
for $c \in [A,B]$. Partitioning the information matrix as in (2.3), let
$\Psi_1,\ldots,\Psi_{k-1}$ be a maximum set of linearly independent
nonconstant $\Psi$ functions in the first $p - p_1$ columns of $C(\theta,c)$.
Define $\Psi_k^Q$ as in (3.2). Suppose that either (3.5) or (3.6) in Lemma 1
holds. Then the following complete class results hold:

(a) For $k = 2n-1$, if (3.5) holds, the designs with at most $n$ support
points, including $B$, form a complete class.

(b) For $k = 2n-1$, if (3.6) holds, the designs with at most $n$ support
points, including $A$, form a complete class.

(c) For $k = 2n$, if (3.5) holds, the designs with at most $n+1$ support
points, including both $A$ and $B$, form a complete class.

(d) For $k = 2n$, if (3.6) holds, the designs with at most $n$ support points
form a complete class.

Note that if (3.3) holds for $\Psi_l(c)$, $l = 1,\ldots,k-1$, then the same is
true if we replace one or more of the $\Psi_l$'s by $-\Psi_l$. Therefore, if
(3.5) or (3.6) do not hold for the original $\Psi_l$'s, conclusions in
Theorem 1 would still be valid if (3.5) and (3.6) hold after multiplying one
or more of the $\Psi_l$'s, $l = 1,\ldots,k-1$, by $-1$.

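While Theorem 1 is very powerful, applying it directly may not be easy.
The next result, which utilizes a generalization of a tool in Yang (2010), will
lead to a condition that is easier to verify. Using the notation of Theorem 1,
define functions $f_{l,t}$, $1 \le t \le k$; $t \le l \le k$ as follows:

(3.7)
$$f_{l,t}(c) = \begin{cases}
\Psi_l'(c), & \text{if } t = 1,\ l = 1,\ldots,k-1, \\
C_{22}'(c), & \text{if } t = 1,\ l = k, \\
\left(\dfrac{f_{l,t-1}(c)}{f_{t-1,t-1}(c)}\right)', & \text{if } 2 \le t \le k,\ t \le l \le k.
\end{cases}$$

The following lower triangular matrix contains all of these functions, and
suggests an order in which to compute them:

(3.8)
$$\begin{pmatrix}
f_{1,1} = \Psi_1' & & & & \\
f_{2,1} = \Psi_2' & f_{2,2} = \left(\frac{f_{2,1}}{f_{1,1}}\right)' & & & \\
f_{3,1} = \Psi_3' & f_{3,2} = \left(\frac{f_{3,1}}{f_{1,1}}\right)' & f_{3,3} = \left(\frac{f_{3,2}}{f_{2,2}}\right)' & & \\
\vdots & \vdots & \vdots & \ddots & \\
f_{k,1} = C_{22}' & f_{k,2} = \left(\frac{f_{k,1}}{f_{1,1}}\right)' & f_{k,3} = \left(\frac{f_{k,2}}{f_{2,2}}\right)' & \cdots & f_{k,k} = \left(\frac{f_{k,k-1}}{f_{k-1,k-1}}\right)'
\end{pmatrix}.$$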

Note that, for $p_1 \ge 2$, the functions in the last row are matrix functions,
which is a key difference with Yang (2010). The derivatives of matrices
in (3.7) are element-wise derivatives. For the next result, we will make the
following assumptions:

(i) All functions $\Psi$ in the information matrix $C(\theta,c)$ are at least
$k$th order differentiable on $(A,B)$.

(ii) For $1 \le l \le k-1$, the functions $f_{l,l}(c)$ have no roots in
$[A,B]$.

For ease of notation, in the remainder we will write $f_{l,l}$ instead of
$f_{l,l}(c)$, and $f_{l,l} > 0$ means that $f_{l,l}(c) > 0$ for all
$c \in [A,B]$. This also applies for $l = k$, in which case it means that the
matrix $f_{k,k}$ is positive definite for all $c \in [A,B]$.
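As a small illustration of the recursion (3.7), consider the hypothetical choice $\Psi_1(c) = c$, $\Psi_2(c) = e^c$, $\Psi_3(c) = ce^c$, picked here only to show the mechanics. Working down the rows of (3.8), each entry is the derivative of the entry to its left divided by the diagonal entry of the preceding column:

```latex
\begin{align*}
f_{1,1} &= \Psi_1' = 1, \\
f_{2,1} &= \Psi_2' = e^{c}, &
f_{2,2} &= \left(\frac{f_{2,1}}{f_{1,1}}\right)' = e^{c}, \\
f_{3,1} &= \Psi_3' = (c+1)e^{c}, &
f_{3,2} &= \left(\frac{f_{3,1}}{f_{1,1}}\right)' = (c+2)e^{c}, &
f_{3,3} &= \left(\frac{f_{3,2}}{f_{2,2}}\right)' = (c+2)' = 1.
\end{align*}
```

All three diagonal terms are positive for every $c$, so assumption (ii) is satisfied for this choice on any interval. A matrix-valued last row is treated in the same way, entry by entry.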

Theorem 2. For a regression model with a single regression variable $x$,
let $c \in [A,B]$, $C(\theta,c)$, $\Psi_1,\ldots,\Psi_{k-1}$ and $\Psi_k^Q$ be
as in Theorem 1. For the functions $f_{l,l}$ in (3.7), define
$F(c) = \prod_{l=1}^k f_{l,l}$, $c \in [A,B]$. Suppose that either $F(c)$ or
$-F(c)$ is positive definite for all $c \in [A,B]$. Then the following
complete class results hold:

(a) For $k = 2n-1$, if $F(c) > 0$, the designs with at most $n$ support points,
including $B$, form a complete class.

(b) For $k = 2n-1$, if $-F(c) > 0$, the designs with at most $n$ support
points, including $A$, form a complete class.

(c) For $k = 2n$, if $F(c) > 0$, the designs with at most $n+1$ support points,
including both $A$ and $B$, form a complete class.

(d) For $k = 2n$, if $-F(c) > 0$, the designs with at most $n$ support points
form a complete class.

Proof. We only present the proof for case (a) since the other cases are
similar. For any nonzero vector $Q$, $Q^T F(c) Q > 0$ for all $c \in [A,B]$.
Among all $f_{l,l}$, $l = 1,\ldots,k-1$, and $Q^T f_{k,k} Q$, suppose that $a$
of them are negative. Let $1 \le l_1 < \cdots < l_a \le k$ denote the
subscripts for these negative terms, and note that $a$ must be even. Note also
that the labels $l_1 < \cdots < l_a$ do not depend on the choice of the vector
$Q$ since $f_{1,1},\ldots,f_{k-1,k-1}$ do not depend on $Q$. Finally, note
that for any $l$ with $1 \le l \le k-1$, if we replace $\Psi_l(c)$ by
$-\Psi_l(c)$, then the signs of $f_{l,l}$ and $f_{l+1,l+1}$ are switched while
all others remain unchanged.

We now change some of the $\Psi_l$'s to $-\Psi_l$. This is done for those $l$
that satisfy $l_{2b-1} \le l < l_{2b}$ for some value of
$b \in \{1,\ldots,a/2\}$. Denote the new functions by
$\{1, \tilde\Psi_1,\ldots,\tilde\Psi_{k-1},\tilde\Psi_k^Q\}$. Notice that
$\tilde\Psi_k^Q = \Psi_k^Q$. From the last observation in the previous
paragraph, it is easy to check that $\tilde f_{l,l} > 0$, $l = 1,\ldots,k$,
for the functions $\tilde f_{l,l}$ that correspond to this new set of
$\Psi$-functions. By Proposition 4 in the Appendix,
$\{1,\tilde\Psi_1,\ldots,\tilde\Psi_{k-1}\}$ and
$\{1,\tilde\Psi_1,\ldots,\tilde\Psi_{k-1},\tilde\Psi_k^Q\}$ are Chebyshev
systems on $[A,B]$, regardless of the choice for $Q \ne 0$. The result follows
now from case (a) of Theorem 1 and the observation immediately after
Theorem 1. □

For case (a) in Theorem 2, the value of $A$ in the interval $[A,B]$ is allowed
to be $-\infty$. In this situation, for any given design $\xi$, we can choose
$A = \min_i c_i$, and the conclusion of the theorem holds. Similarly, $B$ can
be $\infty$ in case (b), and the interval can be unbounded at either side for
case (d).

As noted at the end of Section 2, the results in Yang and Stufken (2009),
Yang (2010), and Dette and Melas (2011) correspond to $p_1 = 1$. The
extension in this paper allows the choice of larger values of $p_1$ where
feasible. Larger values of $p_1$ lead to designs with smaller support sizes.
The reason for this is that the value of $k$ in Theorems 1 and 2 corresponds
to the number of equations in (3.3). For a particular model, this number is
smaller for larger $p_1$. Since the support size of the designs is roughly
half the value of $k$, the support size is smaller for larger values of $p_1$.
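The relation between $k$ and the support size in cases (a) to (d) of Theorem 2 can be tabulated with a few lines of code. The sketch below is only a bookkeeping aid; the function name and the encoding of the sign of $F(c)$ as `+1` or `-1` are assumptions made for illustration.

```python
def complete_class_support(k, f_sign):
    """Return (max_support_points, forced_endpoints) per Theorem 2.

    k      : number of equations in (3.3), a positive integer
    f_sign : +1 if F(c) > 0 on [A, B], -1 if -F(c) > 0 on [A, B]
    """
    if k < 1 or f_sign not in (1, -1):
        raise ValueError("k must be positive and f_sign must be +1 or -1")
    if k % 2 == 1:                      # k = 2n - 1, cases (a) and (b)
        n = (k + 1) // 2
        return (n, ("B",)) if f_sign == 1 else (n, ("A",))
    n = k // 2                          # k = 2n, cases (c) and (d)
    return (n + 1, ("A", "B")) if f_sign == 1 else (n, ())

for k in (6, 7, 11):
    print(k, complete_class_support(k, +1))
```

For instance, $k = 7$ with $F(c) > 0$ gives at most four support points including $B$, and $k = 6$ with $F(c) > 0$ gives at most four support points including both end points; these are the situations encountered in Section 4.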

We will provide some examples of the application of Theorems 1 and 2 in 
the next section, and will offer some further thoughts on the ease of their 
application in Section 5. 

4. Applications. Whether the model is for continuous or discrete data,
with homogeneous or heterogeneous errors, Theorems 1 and 2 can be applied 
as long as the information matrix can be written as in (2.1). As the examples 
in this section will show, in many cases the result of the theorem facilitates 
the determination of complete classes with the minimal number of support 
points. 



4.1. Exponential regression models. Dette, Melas and Wong (2006) studied
exponential regression models, which can be written as

(4.1)  $Y_i = \sum_{l=1}^L a_l e^{-\lambda_l x_i} + \varepsilon_i,$

where the $\varepsilon_i$'s are i.i.d. with mean 0 and variance $\sigma^2$,
and $x_i \in [U,V]$ is the value of the regression variable to be selected by
the experimenter. Here
$\theta = (a_1,\ldots,a_L,\lambda_1,\ldots,\lambda_L)^T$, with $a_l \ne 0$,
$l = 1,\ldots,L$, and $0 < \lambda_1 < \cdots < \lambda_L$. For $L = 2$, they
showed that there is a D-optimal design for
$\theta = (a_1,a_2,\lambda_1,\lambda_2)^T$ based on four points, including the
lower limit $U$. Further, for $L = 3$ and
$\lambda_2 = (\lambda_1 + \lambda_3)/2$, they showed that there is
a D-optimal design for $\theta$ based on six points, again including the lower
limit $U$. By using Theorem 2, we will show that similar conclusions are
possible for other optimality criteria, including A- and E-optimality, and
other functions of interest for many a priori values of $\theta$.

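For $L = 2$, the results in Yang (2010) can be used to obtain a complete
class of designs with at most five points. We can do better with
Theorem 2. The information matrix for
$\theta = (a_1,a_2,\lambda_1,\lambda_2)^T$ under design
$\{(x_i,\omega_i), i = 1,\ldots,N\}$ can be written in the form of (2.1) with
$P(\theta) = \mathrm{diag}\bigl(1, 1, \frac{a_1}{\lambda_2-\lambda_1}, \frac{a_2}{\lambda_2-\lambda_1}\bigr)$
and

(4.2)
$$C(\theta,c) = \begin{pmatrix}
c^{\lambda} & & & \\
c^{\lambda+1} & c^{\lambda+2} & & \\
\log(c)c^{\lambda} & \log(c)c^{\lambda+1} & \log^2(c)c^{\lambda} & \\
\log(c)c^{\lambda+1} & \log(c)c^{\lambda+2} & \log^2(c)c^{\lambda+1} & \log^2(c)c^{\lambda+2}
\end{pmatrix},$$

where $c = e^{-(\lambda_2-\lambda_1)x}$ and
$\lambda = \frac{2\lambda_1}{\lambda_2-\lambda_1}$. Let
$\Psi_1(c) = c^{\lambda}$, $\Psi_2(c) = \log(c)c^{\lambda}$,
$\Psi_3(c) = c^{\lambda+1}$, $\Psi_4(c) = \log(c)c^{\lambda+1}$,
$\Psi_5(c) = c^{\lambda+2}$, $\Psi_6(c) = \log(c)c^{\lambda+2}$ and

$$C_{22}(c) = \begin{pmatrix}
\log^2(c)c^{\lambda} & \log^2(c)c^{\lambda+1} \\
\log^2(c)c^{\lambda+1} & \log^2(c)c^{\lambda+2}
\end{pmatrix}.$$

Then $f_{1,1} = \lambda c^{\lambda-1}$, $f_{2,2} = \frac{1}{c}$,
$f_{3,3} = \frac{\lambda+1}{\lambda}$, $f_{4,4} = \frac{1}{c}$,
$f_{5,5} = \frac{4(\lambda+2)}{\lambda+1}$, $f_{6,6} = \frac{1}{c}$ and

$$f_{7,7}(c) = \begin{pmatrix}
\dfrac{2\lambda}{(\lambda+2)c^3} & \dfrac{\lambda+1}{2(\lambda+2)c^2} \\
\dfrac{\lambda+1}{2(\lambda+2)c^2} & \dfrac{2}{c}
\end{pmatrix}.$$

Note that $c > 0$ and $\lambda > 0$, so that $F(c)$ is positive definite if
$|f_{7,7}(c)| > 0$. This is equivalent to $15\lambda^2 + 30\lambda - 1 > 0$,
which is satisfied when
$\frac{\lambda_2}{\lambda_1} < \frac{\sqrt{960}+30}{\sqrt{960}-30}$.
Thus, by (a) of Theorem 2, we have the following result.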

Theorem 3. For Model (4.1) with $L = 2$, if
$\frac{\lambda_2}{\lambda_1} < \frac{\sqrt{960}+30}{\sqrt{960}-30} \approx 61.98$,
then the designs with at most four points, including the lower limit $U$, form
a complete class.
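The bound in Theorem 3 can be checked numerically. The sketch below solves $15\lambda^2 + 30\lambda - 1 = 0$ for its positive root and converts it to a bound on $\lambda_2/\lambda_1$ through $\lambda = 2\lambda_1/(\lambda_2-\lambda_1)$, that is, $\lambda_2/\lambda_1 = 1 + 2/\lambda$. It also evaluates the determinant of the $2 \times 2$ matrix $f_{7,7}(c)$, with entries taken as $2\lambda/((\lambda+2)c^3)$, $(\lambda+1)/(2(\lambda+2)c^2)$ and $2/c$; the function names are illustrative.

```python
import math

def det_f77(lam, c):
    """Determinant of the 2 x 2 matrix f_{7,7}(c) for the L = 2 model."""
    a = 2.0 * lam / ((lam + 2.0) * c**3)
    b = (lam + 1.0) / (2.0 * (lam + 2.0) * c**2)
    d = 2.0 / c
    return a * d - b * b

def lambda_threshold():
    """Positive root of 15*lam^2 + 30*lam - 1 = 0."""
    return (-30.0 + math.sqrt(30.0**2 + 4.0 * 15.0)) / (2.0 * 15.0)

def ratio_bound():
    """Bound on lambda_2/lambda_1, using lambda_2/lambda_1 = 1 + 2/lam."""
    return 1.0 + 2.0 / lambda_threshold()

closed_form = (math.sqrt(960.0) + 30.0) / (math.sqrt(960.0) - 30.0)
print(ratio_bound(), closed_form)
```

The two printed values agree, and the determinant changes sign as $\lambda$ crosses the threshold, which is consistent with the condition stated in the theorem.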

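For $L = 3$ and $2\lambda_2 = \lambda_1 + \lambda_3$, the information matrix
for $\theta = (a_1,a_2,a_3,\lambda_1,\lambda_2,\lambda_3)^T$ under design
$\{(x_i,\omega_i), i = 1,\ldots,N\}$ can be written in the form of (2.1) with
$P(\theta) = \mathrm{diag}\bigl(1, 1, 1, \frac{a_1}{\lambda_2-\lambda_1}, \frac{a_2}{\lambda_2-\lambda_1}, \frac{a_3}{\lambda_2-\lambda_1}\bigr)$
and

(4.3)
$$C(\theta,c) = \begin{pmatrix}
c^{\lambda} & & & & & \\
c^{\lambda+1} & c^{\lambda+2} & & & & \\
c^{\lambda+2} & c^{\lambda+3} & c^{\lambda+4} & & & \\
\log(c)c^{\lambda} & \log(c)c^{\lambda+1} & \log(c)c^{\lambda+2} & \log^2(c)c^{\lambda} & & \\
\log(c)c^{\lambda+1} & \log(c)c^{\lambda+2} & \log(c)c^{\lambda+3} & \log^2(c)c^{\lambda+1} & \log^2(c)c^{\lambda+2} & \\
\log(c)c^{\lambda+2} & \log(c)c^{\lambda+3} & \log(c)c^{\lambda+4} & \log^2(c)c^{\lambda+2} & \log^2(c)c^{\lambda+3} & \log^2(c)c^{\lambda+4}
\end{pmatrix},$$

where $c = e^{-(\lambda_2-\lambda_1)x}$ and
$\lambda = \frac{2\lambda_1}{\lambda_2-\lambda_1}$. Let
$\Psi_{2i-1}(c) = c^{\lambda+i-1}$ and
$\Psi_{2i}(c) = \log(c)c^{\lambda+i-1}$, $i = 1,\ldots,5$, and let

$$C_{22}(c) = \begin{pmatrix}
\log^2(c)c^{\lambda} & \log^2(c)c^{\lambda+1} & \log^2(c)c^{\lambda+2} \\
\log^2(c)c^{\lambda+1} & \log^2(c)c^{\lambda+2} & \log^2(c)c^{\lambda+3} \\
\log^2(c)c^{\lambda+2} & \log^2(c)c^{\lambda+3} & \log^2(c)c^{\lambda+4}
\end{pmatrix}.$$

Then $f_{1,1} = \lambda c^{\lambda-1}$, $f_{2l,2l} = \frac{1}{c}$,
$l = 1,2,3,4,5$,
$f_{2l+1,2l+1} = \frac{l^2(\lambda+l)}{\lambda+l-1}$, $l = 1,2,3,4$, and

$$f_{11,11}(c) = \begin{pmatrix}
\dfrac{2\lambda}{(\lambda+4)c^5} & \dfrac{\lambda+1}{8(\lambda+4)c^4} & \dfrac{\lambda+2}{18(\lambda+4)c^3} \\
\dfrac{\lambda+1}{8(\lambda+4)c^4} & \dfrac{\lambda+2}{18(\lambda+4)c^3} & \dfrac{\lambda+3}{8(\lambda+4)c^2} \\
\dfrac{\lambda+2}{18(\lambda+4)c^3} & \dfrac{\lambda+3}{8(\lambda+4)c^2} & \dfrac{2}{c}
\end{pmatrix}.$$

Again, $c > 0$ and $\lambda > 0$, so that $F(c)$ is positive definite if
$|f_{11,11}(c)|$ and its leading principal minors are positive. This is
equivalent to

(4.4)
$$55\lambda^2 + 110\lambda - 9 > 0, \qquad 1295\lambda^2 + 5180\lambda - 4 > 0,$$
$$1505\lambda^3 + 9030\lambda^2 + 11499\lambda - 1082 > 0 \quad\text{and}\quad 55\lambda^2 + 330\lambda + 431 > 0.$$

Simple computation shows that this holds for
$\frac{\lambda_2}{\lambda_1} < 23.72$ (or, equivalently,
$\frac{\lambda_3}{\lambda_1} < 46.45$). By Theorem 2, we have the following
result.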

Theorem 4. For model (4.1) with $L = 3$ and
$2\lambda_2 = \lambda_1 + \lambda_3$, if $\frac{\lambda_2}{\lambda_1} < 23.72$,
then the designs with at most six points, including the lower limit $U$,
form a complete class.
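The inequalities in (4.4) are easy to examine numerically. The sketch below assumes the four polynomial conditions $55\lambda^2 + 110\lambda - 9 > 0$, $1295\lambda^2 + 5180\lambda - 4 > 0$, $1505\lambda^3 + 9030\lambda^2 + 11499\lambda - 1082 > 0$ and $55\lambda^2 + 330\lambda + 431 > 0$, and uses bisection to locate the smallest $\lambda > 0$ beyond which all of them hold. Since every polynomial is increasing for $\lambda > 0$, such a threshold exists. The conversion $\lambda_2/\lambda_1 = 1 + 2/\lambda$ follows from $\lambda = 2\lambda_1/(\lambda_2-\lambda_1)$; the function names are illustrative.

```python
def conditions_44(lam):
    """True if all four inequalities in (4.4) hold at lam."""
    return (
        55 * lam**2 + 110 * lam - 9 > 0
        and 1295 * lam**2 + 5180 * lam - 4 > 0
        and 1505 * lam**3 + 9030 * lam**2 + 11499 * lam - 1082 > 0
        and 55 * lam**2 + 330 * lam + 431 > 0
    )

def smallest_lambda(lo=0.0, hi=10.0, iters=100):
    """Bisection for the threshold; the conditions fail at lo and hold at hi."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if conditions_44(mid):
            hi = mid
        else:
            lo = mid
    return hi

lam_star = smallest_lambda()
bound = 1.0 + 2.0 / lam_star           # bound on lambda_2 / lambda_1
print(lam_star, bound)
print(conditions_44(2.0 / (23.72 - 1.0)))
```

The computed bound lies slightly above the rounded value 23.72 quoted in the text, so every ratio $\lambda_2/\lambda_1 < 23.72$ satisfies (4.4), in agreement with Theorem 4.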

4.2. LINEXP model. Demidenko (2006) proposed a model referred to
as the LINEXP model to describe tumor growth delay and regrowth. The
natural logarithm of the tumor volume is modeled as

(4.5)  $y_i = \alpha + \gamma x_i + \beta(e^{-\delta x_i} - 1) + \varepsilon_i,$

with independent $\varepsilon_i \sim N(0,\sigma^2)$ and $x_i \in [U,V]$ as the
value of the single regression variable, which in this case refers to time.
Here $\theta = (\alpha,\gamma,\beta,\delta)^T$ is the parameter vector, where
$\alpha$ is the baseline logarithm of the tumor volume, $\gamma$ is the final
growth rate and $\delta$ is the rate at which killed cells get washed out. The
size of the parameter $\beta$ relative to $\gamma/\delta$ determines whether
regrowth is monotonic ($\beta < \gamma/\delta$) or not. Li and Balakrishnan
(2011) recently studied this model and showed that a D-optimal design for
$\theta$ can be based on four points, including $U$ and $V$. We will now show
that Theorem 2 extends this conclusion to other optimality criteria and
functions of interest.

The information matrix for $\theta$ under design
$\{(x_i,\omega_i), i = 1,\ldots,N\}$ can be written in the form of (2.1) with

(4.6)
$$P(\theta) = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 \\
0 & -\delta & 0 & 0 \\
0 & 0 & 0 & \delta/\beta
\end{pmatrix}^{-1}
\quad\text{and}\quad
C(\theta,c) = \begin{pmatrix}
1 & & & \\
e^{c} & e^{2c} & & \\
c & ce^{c} & c^2 & \\
ce^{c} & ce^{2c} & c^2e^{c} & c^2e^{2c}
\end{pmatrix},$$

where $c = -\delta x$. With a proper choice of $\Psi$ functions, it can be
shown that the result in Yang (2010) yields a complete class of designs with
at most five points, including $U$ and $V$. We can again do better with
Theorem 2. Define $\Psi_1(c) = c$, $\Psi_2(c) = e^{c}$, $\Psi_3(c) = ce^{c}$,
$\Psi_4(c) = e^{2c}$, $\Psi_5(c) = ce^{2c}$ and

$$C_{22}(c) = \begin{pmatrix} c^2 & c^2e^{c} \\ c^2e^{c} & c^2e^{2c} \end{pmatrix}.$$

This yields $f_{1,1} = 1$, $f_{2,2} = e^{c}$, $f_{3,3} = 1$,
$f_{4,4} = 4e^{c}$, $f_{5,5} = 1$ and

$$f_{6,6}(c) = \begin{pmatrix} 2e^{-2c} & e^{-c}/2 \\ e^{-c}/2 & 2 \end{pmatrix}.$$

Clearly $F(c)$ is a positive definite matrix. Therefore, by part (c) of
Theorem 2, we reach the following conclusion.

Theorem 5. For the LINEXP model (4.5), the designs with at most
four points, including $U$ and $V$, form a complete class.
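The positive definiteness used for Theorem 5 can be confirmed with a short numerical check. The sketch below takes $f_{6,6}(c)$ to be the symmetric $2 \times 2$ matrix with diagonal entries $2e^{-2c}$ and $2$ and off-diagonal entry $e^{-c}/2$, and applies Sylvester's criterion on a grid of values of $c = -\delta x$. The grid and the function names are illustrative assumptions; a grid check is not a proof, although here the determinant equals $3.75e^{-2c}$, which is positive for every $c$.

```python
import math

def f66_linexp(c):
    """The 2 x 2 matrix f_{6,6}(c) for the LINEXP model, as read from the text."""
    off = math.exp(-c) / 2.0
    return [[2.0 * math.exp(-2.0 * c), off], [off, 2.0]]

def is_positive_definite_2x2(m):
    """Sylvester's criterion for a symmetric 2 x 2 matrix."""
    return m[0][0] > 0.0 and m[0][0] * m[1][1] - m[0][1] * m[1][0] > 0.0

# Illustrative grid for c = -delta * x on [-5, 0].
grid = [-5.0 + 0.25 * i for i in range(21)]
all_pd = all(is_positive_definite_2x2(f66_linexp(c)) for c in grid)
print(all_pd)
```

Since the scalar terms $f_{1,1},\ldots,f_{5,5}$ are all positive, positive definiteness of $f_{6,6}(c)$ is the only condition that needs checking for $F(c)$.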

4.3. Double-exponential regrowth model. Demidenko (2004), using a
two-compartment model, developed a double-exponential regrowth model to
describe the dynamics of post-irradiated tumors. The model can be written as

(4.7)  $Y_i = \alpha + \ln[\beta e^{\nu x_i} + (1-\beta)e^{-\phi x_i}] + \varepsilon_i,$

with independent $\varepsilon_i \sim N(0,\sigma^2)$ and $x_i \in [U,V]$ again
as the value for the variable time. Here
$\theta = (\alpha,\beta,\nu,\phi)^T$ is the parameter vector, where $\alpha$
is the logarithm of the initial tumor volume, $0 < \beta < 1$ is the
proportional contribution of the first compartment and $\nu$ and $\phi$ are
cell proliferation and death rates.

Using Chebyshev systems and an equivalence theorem, Li and Balakrishnan
(2011) showed that a D-optimal design for $\theta$ can be based on four
points including $U$ and $V$. Theorem 1 allows us to extend this result to
a complete class result, thereby covering many other optimality criteria and
any functions of interest.

The information matrix for $\theta$ under design
$\{(x_i,\omega_i), i = 1,\ldots,N\}$ is of the form (2.1) with

$$P(\theta) = \begin{pmatrix}
1 & 0 & 0 & 0 \\
1 & 1-\beta & 0 & 0 \\
0 & 0 & 1/\beta & 0 \\
0 & 0 & 0 & -1/(1-\beta)
\end{pmatrix}^{-1}$$

and with $C(\theta,x)$ a $4 \times 4$ matrix as in (2.2), where
$\Psi_{11} = 1$, $\Psi_{21} = e^{\nu x}/g(x)$,
$\Psi_{22} = e^{2\nu x}/g^2(x)$, $\Psi_{31} = xe^{\nu x}/g(x)$,
$\Psi_{32} = xe^{2\nu x}/g^2(x)$, $\Psi_{33} = x^2e^{2\nu x}/g^2(x)$,
$\Psi_{41} = xe^{-\phi x}/g(x)$, $\Psi_{42} = xe^{(\nu-\phi)x}/g^2(x)$,
$\Psi_{43} = x^2e^{(\nu-\phi)x}/g^2(x)$ and
$\Psi_{44} = x^2e^{-2\phi x}/g^2(x)$. Here,
$g(x) = \beta e^{\nu x} + (1-\beta)e^{-\phi x}$. Note that $\Psi_{42}$ can be
written as a linear combination of $\Psi_{31}$ and $\Psi_{32}$. We can apply
Theorem 1 if we can show that both
$\{1, \Psi_{21}, \Psi_{22}, \Psi_{41}, -\Psi_{31}, \Psi_{32}\}$ and
$\{1, \Psi_{21}, \Psi_{22}, \Psi_{41}, -\Psi_{31}, \Psi_{32}, Q^T C_{22}(x) Q\}$
are Chebyshev systems for any nonzero vector $Q$, where

$$C_{22}(x) = \begin{pmatrix} \Psi_{33} & \Psi_{43} \\ \Psi_{43} & \Psi_{44} \end{pmatrix}.$$

Rather than do this directly, we first simplify the problem. We multiply
each of the $\Psi$'s by the positive function $e^{2\phi x}g(x)^2$, which
preserves the Chebyshev system property. After further simplifications by
replacing some of the resulting functions by independent linear combinations
of these functions, which also preserves the Chebyshev system property, we
arrive at the systems
$\{1, e^{(\nu+\phi)x}, e^{2(\nu+\phi)x}, x, -xe^{(\nu+\phi)x}, xe^{2(\nu+\phi)x}\}$
and
$\{1, e^{(\nu+\phi)x}, e^{2(\nu+\phi)x}, x, -xe^{(\nu+\phi)x}, xe^{2(\nu+\phi)x}, g^2(x)e^{2\phi x}Q^T C_{22}(x) Q\}$.
It suffices to show that these are Chebyshev systems for any nonzero vector
$Q$, which follows from Proposition 4 if we show that $f_{l,l} > 0$,
$l = 1,\ldots,6$, for the latter system. It can be shown that
$f_{1,1} = f_{2,2}/2 = 2f_{4,4} = f_{5,5}/4 = ae^{ax}$,
$f_{3,3} = e^{-2ax}$ and

$$f_{6,6} = \begin{pmatrix} 2 & e^{-ax}/2 \\ e^{-ax}/2 & 2e^{-2ax} \end{pmatrix},$$

where $a = \nu + \phi$. Thus both systems are Chebyshev systems, and by
part (c) of Theorem 1, we reach the following conclusion.

Theorem 6. For the double-exponential regrowth model (4.7), the
designs with at most four points, including $U$ and $V$, form a complete class.

5. Discussion. We have given a powerful extension of the result in Yang 
(2010) that has potential for providing a small complete class of designs 
whenever the information matrix can be written as in (2.1). Irrespective 
of the optimality criterion (provided that it does not violate the Loewner 
ordering) and of the function of $\theta$ that is of interest, the search for an optimal
design can be restricted to the small complete class. As the examples in 
Section 4 show, the results lead us to conclusions that were not possible 
using the results in Yang (2010) and Dette and Melas (2011). 

As already pointed out, direct application of Theorem 1 may not be easy. 
Section 4.3 shows some tricks that can be useful when using Theorem 1. 
Direct application of Theorem 2 is easier because the condition for the func- 
tion F(c) can be verified with the help of software for symbolic computations. 
Sometimes it is more convenient to do this after multiplying each of the $\Psi$
functions by the same positive function (see Section 4.3).

There remain, however, some basic questions related to the application of
either Theorem 1 or Theorem 2 that do not have simple general answers. For
example, what is a good choice for $p_1$ in forming the matrix $C_{22}(c)$
in (2.4)? In Section 4, the choice $p_1 = p/2$ worked well, and selecting
$p_1$ approximately equal to $p/2$ may be a good general starting point.
Moreover, there is the question of how to order the rows and columns of the
information matrix. By reordering the elements in the parameter vector
$\theta$, we could wind up with different matrices $C_{22}(c)$, even after
fixing $p_1$. So what ordering is best? In all of the examples in Section 4,
we have used an ordering that makes "higher-order terms" appear in
$C_{22}(c)$, and this may offer the best general strategy. There is still
another issue related to ordering: In renaming the independent
$\Psi$-functions in the first $p - p_1$ columns of $C(\theta,c)$, different
orders will result in different $f_{l,l}$-functions. In some cases, but not in
all, these functions will result in a function $F(c)$ that satisfies the
condition in Theorem 2. In the examples, we have tended to associate
"lower-order terms" with the earlier $\Psi$-functions, but what order is best
may require some trial and error.

Whereas we have demonstrated that the main results of the paper are 
powerful, regrettably we cannot offer any guarantees that they will always 
give results as desired, even when the information matrix can be written in 
the form (2.1). 
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APPENDIX 

Proposition 1. Assume that {^! , . . . , ^k-i} is a Chebyshev sys- 
tem defined on an interval [A,B]. Let A < z\ < z 2 < • • • < zt < B, and let 
r±,...,rt be coefficients that satisfy the following k equations: 
t 

(A.l) ^7^(^ = 0, 1 = 0,1,. ..,k-l. 

i=l 

Then we have: 

(a) If t < k, then ri = 0,i = 1, . . . ,t. 

(b) If t = k + 1 and one ri is not zero, then all are nonzero; moreover 
all ri 's for odd i must then have the same sign, which is opposite to that of 
the ri 's for even i. 

Proof. For part (a), if $t < k$, we can expand $z_1, \dots, z_t$ to a set of $k$ 
distinct points, taking $r_i = 0$ for the added points. Thus, without loss of 
generality, take $t = k$. Consider the matrix 

$$\Psi(z_1, z_2, \dots, z_k) = \begin{pmatrix} \Psi_0(z_1) & \Psi_0(z_2) & \cdots & \Psi_0(z_k) \\ \Psi_1(z_1) & \Psi_1(z_2) & \cdots & \Psi_1(z_k) \\ \vdots & \vdots & & \vdots \\ \Psi_{k-1}(z_1) & \Psi_{k-1}(z_2) & \cdots & \Psi_{k-1}(z_k) \end{pmatrix}. \tag{A.2}$$

Then (A.1) can be written as 

$$\Psi(z_1, z_2, \dots, z_k) R = 0,$$

where $R = (r_1, \dots, r_k)^T$. Since $\{\Psi_0, \Psi_1, \dots, \Psi_{k-1}\}$ is a Chebyshev system, 
$\Psi(z_1, z_2, \dots, z_k)$ is nonsingular, so that $R = 0$. 

For part (b), if one $r_i$ is 0, then it follows from part (a) that all $r_i$'s are 0. 
Therefore, if at least one $r_i$ is nonzero, then all of them must be nonzero. 
With the notation from the previous paragraph, we can write (A.1) as 

$$\Psi(z_1, z_2, \dots, z_k) R = -r_{k+1} \psi(z_{k+1}),$$

where $\psi(z_{k+1}) = (\Psi_0(z_{k+1}), \dots, \Psi_{k-1}(z_{k+1}))^T$. It follows that 

$$r_i = -r_{k+1} \frac{|\Psi(z_1, \dots, z_{i-1}, z_{k+1}, z_{i+1}, \dots, z_k)|}{|\Psi(z_1, z_2, \dots, z_k)|}, \qquad i = 1, \dots, k. \tag{A.3}$$

By the Chebyshev system assumption, the denominator $|\Psi(z_1, z_2, \dots, z_k)|$ 
in (A.3) is positive, while the numerator $|\Psi(z_1, \dots, z_{i-1}, z_{k+1}, z_{i+1}, \dots, z_k)|$ is 
positive for $i = k, k-2, \dots$ and negative otherwise. The result in (b) follows. 
□ 
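
To make part (b) concrete, the following sketch evaluates (A.3) exactly for one small case. The monomial system $1, z, z^2, z^3$ (so $k = 4$) and the five points are assumptions chosen for illustration, and the helper names `det`, `psi_matrix` and `coefficients` are hypothetical. It computes $r_1, \dots, r_k$ by Cramer's rule with $r_{k+1} = 1$ and checks that the signs alternate.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over Fractions."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    sign, result = 1, Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            sign = -sign
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return sign * result

def psi_matrix(funcs, points):
    """Matrix as in (A.2): entry (l, i) is funcs[l](points[i])."""
    return [[f(z) for z in points] for f in funcs]

def coefficients(funcs, points, r_last=Fraction(1)):
    """Cramer's rule (A.3): r_1..r_k given r_{k+1} = r_last, for t = k + 1."""
    k = len(funcs)
    base, z_last = list(points[:k]), points[k]
    denom = det(psi_matrix(funcs, base))
    r = []
    for i in range(k):
        # Replace z_i by z_{k+1}, keeping its position in the list.
        swapped = base[:i] + [z_last] + base[i + 1:]
        r.append(-r_last * det(psi_matrix(funcs, swapped)) / denom)
    return r + [r_last]

k = 4
funcs = [lambda z, l=l: z ** l for l in range(k)]      # assumed system: 1, z, z^2, z^3
points = [Fraction(p, 10) for p in (1, 4, 5, 8, 10)]   # assumed points in [0, 1]
r = coefficients(funcs, points)

# Left-hand sides of the k equations in (A.1); each should be exactly zero.
residuals = [sum(ri * f(z) for ri, z in zip(r, points)) for f in funcs]
print([ri > 0 for ri in r])
print(residuals)
```

Exact rational arithmetic is used so that the equations in (A.1) can be checked for equality rather than up to rounding error.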

Proposition 2. Let $\{\Psi_0 = 1, \Psi_1, \dots, \Psi_{k-1}\}$ be a Chebyshev system on 
an interval $[A,B]$, and suppose that $k = 2n - 1$. Consider $n$ pairs $(c_i, \omega_i)$, 
$i = 1, \dots, n$, and $n$ pairs $(\tilde c_i, \tilde\omega_i)$, $i = 1, \dots, n$, with $\omega_i > 0$, $\tilde\omega_i > 0$ and 
$A \le c_1 < \tilde c_1 < \cdots < c_n < \tilde c_n = B$. Suppose further that the following $k$ equations 
hold: 

$$\sum_i \omega_i \Psi_\ell(c_i) = \sum_i \tilde\omega_i \Psi_\ell(\tilde c_i), \qquad \ell = 0, 1, \dots, k-1. \tag{A.4}$$

Then, for any function $\Psi_k$ on $[A,B]$, we can conclude that 

$$\sum_i \omega_i \Psi_k(c_i) < \sum_i \tilde\omega_i \Psi_k(\tilde c_i) \tag{A.5}$$

if $\{\Psi_0 = 1, \Psi_1, \dots, \Psi_{k-1}, \Psi_k\}$ is also a Chebyshev system. 

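Proof. With 

$$R = (\omega_1, -\tilde\omega_1, \omega_2, -\tilde\omega_2, \dots, \omega_n)^T,$$

the $k$ equations in (A.4) can be written as 

$$\Psi(c_1, \tilde c_1, \dots, c_n) R = \tilde\omega_n \psi(\tilde c_n), \tag{A.6}$$

where $\Psi$ and $\psi$ are as defined in the proof of Proposition 1. Further, (A.5) 
is equivalent to 

$$(\Psi_k(c_1), \Psi_k(\tilde c_1), \dots, \Psi_k(c_n)) R < \tilde\omega_n \Psi_k(\tilde c_n). \tag{A.7}$$

Using (A.6) to solve for $R$, and using that $\tilde\omega_n > 0$, we see that (A.7) is 
equivalent to 

$$(\Psi_k(c_1), \Psi_k(\tilde c_1), \dots, \Psi_k(c_n)) \Psi^{-1}(c_1, \tilde c_1, \dots, c_n) \psi(\tilde c_n) - \Psi_k(\tilde c_n) < 0. \tag{A.8}$$

From an elementary matrix result [see, e.g., Theorem 13.3.8 of Harville 
(1997)], the left-hand side of (A.8) can be written as 

$$-\frac{|\Psi^*(c_1, \tilde c_1, \dots, c_n, \tilde c_n)|}{|\Psi(c_1, \tilde c_1, \dots, c_n)|}, \tag{A.9}$$

where 

$$\Psi^*(c_1, \tilde c_1, \dots, c_n, \tilde c_n) = \begin{pmatrix} \Psi_0(c_1) & \Psi_0(\tilde c_1) & \cdots & \Psi_0(c_n) & \Psi_0(\tilde c_n) \\ \Psi_1(c_1) & \Psi_1(\tilde c_1) & \cdots & \Psi_1(c_n) & \Psi_1(\tilde c_n) \\ \vdots & \vdots & & \vdots & \vdots \\ \Psi_{k-1}(c_1) & \Psi_{k-1}(\tilde c_1) & \cdots & \Psi_{k-1}(c_n) & \Psi_{k-1}(\tilde c_n) \\ \Psi_k(c_1) & \Psi_k(\tilde c_1) & \cdots & \Psi_k(c_n) & \Psi_k(\tilde c_n) \end{pmatrix}. \tag{A.10}$$

Since both $\{\Psi_0, \dots, \Psi_{k-1}\}$ and $\{\Psi_0, \dots, \Psi_{k-1}, \Psi_k\}$ are Chebyshev 
systems and $c_1 < \tilde c_1 < \cdots < c_n < \tilde c_n$, it follows that (A.9) is negative, which 
is what had to be shown. □ 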
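
A small exact computation can illustrate Proposition 2. The sketch below assumes $n = 2$, the monomial system $1, c, c^2$ with $\Psi_3(c) = c^3$ on $[0,1]$, and four interlaced points chosen for illustration; the helpers `solve`, `matching_weights` and `weighted_sum` are hypothetical names. It solves (A.6) for the weights after fixing $\tilde\omega_n = 1$, and then compares the two sides of (A.5). The weights are not normalized to sum to one, which the proposition does not require.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b exactly by Gauss-Jordan elimination."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [row[n] for row in m]

def matching_weights(funcs, c, c_tilde, w_tilde_last=Fraction(1)):
    """Weights satisfying (A.4) for k = 2n - 1, given the weight at the last c_tilde."""
    # Interlaced points c_1, c~_1, ..., c_n; the last c~_n goes to the right-hand side.
    pts = [p for pair in zip(c, c_tilde) for p in pair][:-1]
    a = [[f(p) for p in pts] for f in funcs]
    b = [w_tilde_last * f(c_tilde[-1]) for f in funcs]
    r = solve(a, b)  # R = (w_1, -w~_1, ..., w_n), as in (A.6)
    w = r[0::2]
    w_tilde = [-v for v in r[1::2]] + [w_tilde_last]
    return w, w_tilde

def weighted_sum(psi, weights, points):
    return sum(w * psi(p) for w, p in zip(weights, points))

n = 2
funcs = [lambda c, l=l: c ** l for l in range(2 * n - 1)]  # assumed system: 1, c, c^2
psi_k = lambda c: c ** (2 * n - 1)                         # assumed Psi_k: c^3
c = [Fraction(1, 10), Fraction(6, 10)]
c_tilde = [Fraction(4, 10), Fraction(1)]                   # last point equals B = 1

w, w_tilde = matching_weights(funcs, c, c_tilde)
lhs = weighted_sum(psi_k, w, c)              # left-hand side of (A.5)
rhs = weighted_sum(psi_k, w_tilde, c_tilde)  # right-hand side of (A.5)
print(w, w_tilde, lhs < rhs)
```

That all computed weights come out positive agrees with the sign pattern in Proposition 1(b) applied to $R$ extended by $-\tilde\omega_n$.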

A similar argument as for Proposition 2 can be used for the next result. 

Proposition 3. Let $\{\Psi_0 = 1, \Psi_1, \dots, \Psi_{k-1}\}$ be a Chebyshev system on 
an interval $[A,B]$ and suppose that $k = 2n$. Consider $n$ pairs $(c_i, \omega_i)$, $i = 
1, \dots, n$, and $n + 1$ pairs $(\tilde c_i, \tilde\omega_i)$, $i = 0, 1, \dots, n$, with $\omega_i > 0$, $\tilde\omega_i > 0$ and 
$A = \tilde c_0 < c_1 < \tilde c_1 < \cdots < c_n < \tilde c_n = B$. Suppose further that the following $k$ 
equations hold: 

$$\sum_i \omega_i \Psi_\ell(c_i) = \sum_i \tilde\omega_i \Psi_\ell(\tilde c_i), \qquad \ell = 0, 1, \dots, k-1. \tag{A.11}$$

Then, for any function $\Psi_k$ on $[A,B]$, we can conclude that 

$$\sum_i \omega_i \Psi_k(c_i) < \sum_i \tilde\omega_i \Psi_k(\tilde c_i) \tag{A.12}$$

if $\{\Psi_0 = 1, \Psi_1, \dots, \Psi_{k-1}, \Psi_k\}$ is also a Chebyshev system. 
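
The simplest instance of Proposition 3 can be checked by hand or with the short sketch below. It assumes $n = 1$ (so $k = 2$), $\Psi_1(c) = c$ and $\Psi_2(c) = e^c$ on $[A,B] = [0,2]$, all chosen for illustration; `endpoint_weights` is a hypothetical helper. In this case (A.11) matches the total weight and the first moment, and (A.12) reduces to a Jensen-type inequality for the strictly convex function $e^c$.

```python
import math

def endpoint_weights(a, b, c1, w1=1.0):
    """Weights at A and B matching the 0th and 1st moments of mass w1 at c1."""
    w_b = w1 * (c1 - a) / (b - a)
    w_a = w1 - w_b
    return w_a, w_b

a, b, c1 = 0.0, 2.0, 0.5          # assumed interval [A, B] and interior point c_1
w_a, w_b = endpoint_weights(a, b, c1)

lhs = math.exp(c1)                                # w_1 * Psi_2(c_1) with w_1 = 1
rhs = w_a * math.exp(a) + w_b * math.exp(b)       # weights at c~_0 = A and c~_1 = B
print(w_a, w_b, lhs < rhs)
```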

Proposition 4. Consider functions $\Psi_0 = 1, \Psi_1, \dots, \Psi_k$ on an interval 
$[A,B]$. Compute the corresponding functions $f_{\ell,t}$ as in (3.7), but with 
$C_{22}(c)$ replaced by $\Psi_k$, and suppose that $f_{\ell,\ell} > 0$, $\ell = 1, \dots, k-1$. Then 
$\{1, \Psi_1, \dots, \Psi_k\}$ is a Chebyshev system if $f_{k,k} > 0$, while $\{1, \Psi_1, \dots, -\Psi_k\}$ 
is a Chebyshev system if $f_{k,k} < 0$. 

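Proof. The conclusion for the case $f_{k,k} < 0$ follows immediately from 
that for $f_{k,k} > 0$, so that we will only focus on the latter. We need to show 
that 

$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ \Psi_1(z_0) & \Psi_1(z_1) & \cdots & \Psi_1(z_k) \\ \vdots & \vdots & & \vdots \\ \Psi_k(z_0) & \Psi_k(z_1) & \cdots & \Psi_k(z_k) \end{vmatrix} > 0 \tag{A.13}$$

for any given $A \le z_0 < z_1 < \cdots < z_k \le B$. Consider (A.13) as a function 
of $z_k$. The determinant is 0 if $z_k = z_{k-1}$, so that it suffices to show that the 
derivative of (A.13) with respect to $z_k$ is positive on $(z_{k-1}, B)$, that is, 

$$\begin{vmatrix} 1 & 1 & \cdots & 1 & 0 \\ \Psi_1(z_0) & \Psi_1(z_1) & \cdots & \Psi_1(z_{k-1}) & f_{1,1}(z_k) \\ \vdots & \vdots & & \vdots & \vdots \\ \Psi_k(z_0) & \Psi_k(z_1) & \cdots & \Psi_k(z_{k-1}) & f_{k,1}(z_k) \end{vmatrix} > 0 \tag{A.14}$$
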



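for any $z_k \in (z_{k-1}, B)$. Now consider (A.14) as a function of $z_{k-1}$ and use 
a similar argument. It suffices to show that for $z_{k-1} \in (z_{k-2}, z_k)$, 

$$\begin{vmatrix} 1 & 1 & \cdots & 0 & 0 \\ \Psi_1(z_0) & \Psi_1(z_1) & \cdots & f_{1,1}(z_{k-1}) & f_{1,1}(z_k) \\ \vdots & \vdots & & \vdots & \vdots \\ \Psi_k(z_0) & \Psi_k(z_1) & \cdots & f_{k,1}(z_{k-1}) & f_{k,1}(z_k) \end{vmatrix} > 0. \tag{A.15}$$

Continuing like this, it suffices to show that 

$$\begin{vmatrix} f_{1,1}(z_1) & f_{1,1}(z_2) & \cdots & f_{1,1}(z_k) \\ \vdots & \vdots & \ddots & \vdots \\ f_{k,1}(z_1) & f_{k,1}(z_2) & \cdots & f_{k,1}(z_k) \end{vmatrix} > 0 \tag{A.16}$$

for any $A \le z_1 < z_2 < \cdots < z_k \le B$. Since $f_{1,1}(c) > 0$ for $c \in [A,B]$, (A.16) is 
equivalent to 

$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ \dfrac{f_{2,1}(z_1)}{f_{1,1}(z_1)} & \dfrac{f_{2,1}(z_2)}{f_{1,1}(z_2)} & \cdots & \dfrac{f_{2,1}(z_k)}{f_{1,1}(z_k)} \\ \vdots & \vdots & & \vdots \\ \dfrac{f_{k,1}(z_1)}{f_{1,1}(z_1)} & \dfrac{f_{k,1}(z_2)}{f_{1,1}(z_2)} & \cdots & \dfrac{f_{k,1}(z_k)}{f_{1,1}(z_k)} \end{vmatrix} > 0. \tag{A.17}$$
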



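Recall that the derivatives of the entries in the last $k - 1$ rows of this matrix 
are by definition simply the values of $f_{\ell,2}$, $\ell = 2, \dots, k$. Hence, applying the same 
arguments used for (A.13) to (A.17) and using that $f_{2,2}(c) > 0$ for $c \in [A,B]$, it is 
sufficient to show that 

$$\begin{vmatrix} 1 & \cdots & 1 \\ \dfrac{f_{3,2}(z_2)}{f_{2,2}(z_2)} & \cdots & \dfrac{f_{3,2}(z_k)}{f_{2,2}(z_k)} \\ \vdots & & \vdots \\ \dfrac{f_{k,2}(z_2)}{f_{2,2}(z_2)} & \cdots & \dfrac{f_{k,2}(z_k)}{f_{2,2}(z_k)} \end{vmatrix} > 0. \tag{A.18}$$

Continuing like this, the ultimate sufficient condition is that $f_{k,k}(c) > 0$ for 
$c \in [A,B]$, which is precisely our assumption. Thus the conclusion follows. 
□ 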
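
As a numerical cross-check of Proposition 4, the sketch below takes the illustrative system $\{1, c, e^c, c e^c\}$ on $[0,2]$ (our choice, not an example from the paper), so that $k = 3$. It assumes that (3.7) sets $f_{\ell,1} = \Psi_\ell'$ and $f_{\ell,t} = (f_{\ell,t-1}/f_{t-1,t-1})'$ for $t \ge 2$, which is how the proof above uses these functions. Under that assumption a hand computation gives $f_{1,1} = 1$, $f_{2,2} = e^c$ and $f_{3,3} = 1$, all positive. The code then evaluates the determinant in (A.13) on every increasing 4-point subset of a grid and checks that it is positive, as the proposition predicts. The grid and the helper `det` are hypothetical.

```python
import itertools
import math

def det(matrix):
    """Determinant by Gaussian elimination with partial pivoting (floats)."""
    m = [list(map(float, row)) for row in matrix]
    n, result = len(m), 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if m[pivot][col] == 0.0:
            return 0.0
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return result

# Illustrative system (an assumption, not taken from the paper): 1, c, e^c, c e^c.
system = [lambda c: 1.0, lambda c: c, math.exp, lambda c: c * math.exp(c)]

# Hand-derived diagonal functions under the assumed recursion for (3.7):
#   f_{1,1} = 1, f_{2,2} = e^c, f_{3,3} = 1.
diagonal = [lambda c: 1.0, math.exp, lambda c: 1.0]

grid = [0.25 * j for j in range(9)]  # grid on the assumed interval [A, B] = [0, 2]
diag_positive = all(f(c) > 0 for f in diagonal for c in grid)

# combinations() of a sorted grid yields increasing tuples z_0 < z_1 < z_2 < z_3.
dets = [det([[psi(z) for z in pts] for psi in system])
        for pts in itertools.combinations(grid, len(system))]
print(diag_positive, min(dets) > 0)
```

A finite grid can only support the conclusion, not prove it; the proof above is what covers all point configurations.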